SwinNet: Swin Transformer Drives Edge-Aware RGB-D and RGB-T Salient Object Detection

Abstract

Convolutional neural networks (CNNs) are good at extracting contextual features within certain receptive fields, while transformers can model global long-range dependency features. By absorbing the advantages of the transformer and the merits of the CNN, Swin Transformer shows strong feature representation ability. Based on it, we propose a cross-modality fusion model, SwinNet, for RGB-D and RGB-T salient object detection. It is driven by Swin Transformer to extract hierarchical features, boosted by an attention mechanism to bridge the gap between the two modalities, and guided by edge information to sharpen the contour of the salient object. To be specific, a two-stream Swin Transformer encoder first extracts multi-modality features, and then a spatial alignment and channel re-calibration module is presented to optimize intra-level cross-modality features. To clarify the fuzzy boundary, an edge-guided decoder achieves inter-level cross-modality fusion under the guidance of edge features. The proposed model outperforms state-of-the-art models on RGB-D and RGB-T datasets, showing that it provides more insight into the cross-modality complementarity task.
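The idea of re-weighting one modality's features using spatial and channel cues from the other can be illustrated with a small sketch. This is not the module from the paper: the function names (`spatial_mask`, `channel_means`, `fuse`) are hypothetical, the weights are parameter-free sigmoid gates chosen only for illustration, and a real model would learn them. Features are plain nested lists of shape C x H x W.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_means(feat):
    """Global average pooling: one mean per channel of a C x H x W nested list."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]


def spatial_mask(feat):
    """H x W map: sigmoid of the per-pixel mean across channels."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    return [
        [sigmoid(sum(feat[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]


def fuse(rgb, aux):
    """Toy cross-modality fusion (an assumption, not SwinNet's actual design).

    The auxiliary modality (depth or thermal) supplies a spatial mask and
    per-channel gates; the RGB features are scaled by both and the auxiliary
    features are added back as a residual.
    """
    mask = spatial_mask(aux)                          # where to look
    gates = [sigmoid(m) for m in channel_means(aux)]  # which channels matter
    return [
        [
            [v * mask[i][j] * gates[k] + aux[k][i][j] for j, v in enumerate(row)]
            for i, row in enumerate(ch)
        ]
        for k, ch in enumerate(rgb)
    ]
```

With an all-zero auxiliary map, both gates evaluate to 0.5, so each RGB value is simply scaled by 0.25; informative auxiliary features shift the gates away from that neutral point.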

Similar resources

Elastic Edge Boxes for Object Proposal on RGB-D Images

Object proposal is utilized as a fundamental preprocessing of various multimedia applications by detecting the candidate regions of objects in images. In this paper, we propose a novel object proposal method, named elastic edge boxes, integrating window scoring and grouping strategies and utilizing both color and depth cues in RGBD images. We first efficiently generate the initial bounding boxe...

Object proposal on RGB-D images via elastic edge boxes

As a fundamental preprocessing of various multimedia applications, object proposal aims to detect the candidate windows possibly containing arbitrary objects in images with two typical strategies, window scoring and grouping. In this paper, we first analyze the feasibility of improving object proposal performance by integrating window scoring and grouping strategies. Then, we propose a novel ob...

RGB-D Salient Object Detection Based on Discriminative Cross-modal Transfer Learning

In this work, we propose to utilize Convolutional Neural Networks (CNNs) to boost the performance of depth-induced salient object detection by capturing the high-level representative features for depth modality. We formulate the depth-induced saliency detection as a CNN-based cross-modal transfer problem to bridge the gap between the "data-hungry" nature of CNNs and the unavailability of suff...

Local Background Enclosure for RGB-D Salient Object Detection - Supplementary Results

The purpose of this supplementary material is to examine in detail the contributions of our proposed Local Background Enclosure (LBE) feature. A comparison of LBE with the contrast based depth features used in state-of-the-art salient object detection systems is presented. The LBE feature is compared with the raw depth features ACSD [1], DC [3] and a signed version of DC denoted SDC on the RGBD...

Depth-aware CNN for RGB-D Segmentation

Convolutional neural networks (CNN) are limited by the lack of capability to handle geometric information due to the fixed grid kernel structure. The availability of depth data enables progress in RGB-D semantic segmentation with CNNs. State-of-the-art methods either use depth as additional images or process spatial information in 3D volumes or point clouds. These methods suffer from high compu...

Journal

Journal title: IEEE Transactions on Circuits and Systems for Video Technology

Year: 2022

ISSN: 1051-8215, 1558-2205

DOI: https://doi.org/10.1109/tcsvt.2021.3127149